
gh-144888: Replace bloom filter linked lists with contiguous arrays to optimize executor invalidation performance #145873

Open
cocolato wants to merge 2 commits into python:main from cocolato:gh-144888

Conversation

@cocolato
Contributor

@cocolato cocolato commented Mar 12, 2026

During JIT compilation, when function objects are destroyed or code objects are modified, all executors must be traversed to inspect their dependencies, and the affected executors must be invalidated. The original implementation stored executors in singly linked lists, which caused many pointer dereferences during traversal and consequently poor CPU cache efficiency.

This PR changes the executor storage structure from a linked list to a contiguous array, reducing pointer jumps during traversal to improve CPU cache efficiency. It also implements O(1) deletion using swap-remove, thereby accelerating dependency invalidation operations.
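The swap-remove idea is simple: overwrite the removed slot with the last element and shrink the array, so no elements need to shift. A minimal sketch of the technique (the `bloom_array_idx` field name follows the PR; `entry_t` and `swap_remove` are illustrative, not the PR's actual API):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record: each entry stores its own index so the owning
 * object can be patched when the entry is swapped into a vacated slot. */
typedef struct {
    int32_t bloom_array_idx;  /* index of this entry in the array */
    int     id;               /* payload, stands in for the executor pointer */
} entry_t;

/* O(1) deletion: move the last entry into the freed slot, fix up the
 * moved entry's stored index, then shrink the logical length. */
static void
swap_remove(entry_t *arr, int32_t *count, int32_t idx)
{
    arr[idx] = arr[*count - 1];       /* overwrite with the last element */
    arr[idx].bloom_array_idx = idx;   /* patch the moved entry's index */
    (*count)--;
}
```

The trade-off is that swap-remove does not preserve element order, which is fine here since executor invalidation only needs to scan the whole set.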

Member

@markshannon markshannon left a comment


Thanks for doing this.
I've only had time to do a quick scan, but this looks like it should speed up the scan considerably.

-    _PyBloomFilter bloom;
-    _PyExecutorLinkListNode links;
+    int32_t bloom_array_idx;        // Index in interp->executor_blooms/executor_ptrs.
+    _PyExecutorLinkListNode links;  // Used by deletion list.
Member


Is this necessary now? We can traverse all executors using the executor_ptrs array.

Contributor Author


I think we still need it to keep the pending deletion list:

cpython/Python/optimizer.c

Lines 332 to 338 in 08a018e

static void
uop_dealloc(PyObject *op) {
_PyExecutorObject *self = _PyExecutorObject_CAST(op);
executor_invalidate(op);
assert(self->vm_data.code == NULL);
add_to_pending_deletion_list(self);
}
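The dealloc path above queues executors for later cleanup rather than freeing them immediately, which is where an intrusive linked node is still convenient: it needs no allocation and pushing is O(1). A minimal sketch of such a pending list (node and function names are illustrative, not CPython's actual API):

```c
#include <stddef.h>

/* Hypothetical intrusive singly linked node, embedded in each executor. */
typedef struct pending_node {
    struct pending_node *next;
} pending_node_t;

static pending_node_t *pending_head = NULL;  /* head of pending-deletion list */

/* Push onto the list: O(1), allocation-free, safe to call from dealloc. */
static void
add_to_pending(pending_node_t *n)
{
    n->next = pending_head;
    pending_head = n;
}
```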

@cocolato
Contributor Author

@Fidget-Spinner gentle ping, if you have time, please take a look at this, thanks!

@Fidget-Spinner
Member

Do you have benchmarks for this? A microbenchmark is fine.

@cocolato

This comment was marked as outdated.

@cocolato
Contributor Author

I wrote a microbench:

bench.py:

import time
N = 1000
ROUNDS = 10
for r in range(ROUNDS):
    classes = []
    for i in range(N):
        cls = type(f"C{i}", (), {"val": i})
        ns = {"cls": cls}
        exec("def f(n):\n o=cls()\n s=0\n for j in range(n): s+=o.val\n return s", ns)
        classes.append((cls, ns["f"]))
    for _, f in classes:
        for _ in range(200):
            f(10)
    t0 = time.perf_counter_ns()
    for cls, _ in classes:
        cls.val = -1
    elapsed = time.perf_counter_ns() - t0
    print(f"round {r}: {elapsed / 1e3:.1f} us  ({elapsed // N} ns/scan)")

test.sh:

export PYTHON_JIT_STRESS=1
echo "=== Baseline (linked list) ===" 
./python_base.exe /tmp/bench_bloom.py 
echo && echo "=== Optimized (contiguous array) ==="
./python.exe /tmp/bench_bloom.py

result:

=== Baseline (linked list) ===
round 0: 5111.4 us  (5111 ns/scan)
round 1: 6275.0 us  (6274 ns/scan)
round 2: 5421.2 us  (5421 ns/scan)
round 3: 5388.6 us  (5388 ns/scan)
round 4: 6240.9 us  (6240 ns/scan)
round 5: 6356.5 us  (6356 ns/scan)
round 6: 6139.0 us  (6139 ns/scan)
round 7: 6383.8 us  (6383 ns/scan)
round 8: 5474.0 us  (5474 ns/scan)
round 9: 6461.1 us  (6461 ns/scan)

=== Optimized (contiguous array) ===
round 0: 2657.8 us  (2657 ns/scan)
round 1: 2792.6 us  (2792 ns/scan)
round 2: 2861.5 us  (2861 ns/scan)
round 3: 2766.0 us  (2765 ns/scan)
round 4: 2650.5 us  (2650 ns/scan)
round 5: 2689.4 us  (2689 ns/scan)
round 6: 2691.5 us  (2691 ns/scan)
round 7: 2769.4 us  (2769 ns/scan)
round 8: 2864.1 us  (2864 ns/scan)
round 9: 2786.0 us  (2786 ns/scan)

@cocolato
Contributor Author

However, since the time spent on the scan is too small compared to warmup, I did not observe any noticeable performance improvement in fastmark.

@Fidget-Spinner
Member

That's an excellent result!
